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THE  INVERSE  PROBLEM  AND  THE  PSEUDO-EMPIRICAL  ORTHOGONAL  FUNCTION 
METHOD  OF  SOLUTION  PART  1:  THEORY;  PART  2:  APPLICATION 


THEORY 


1.1  Introduction 


Many  schemes  have  been  developed  to  solve  the  so-called  "inverse 
problem."  However,  it  remains  a  fact,  regardless  of  the  scheme,  that 
normally  the  information  content  in  a  given  set  of  measurements  is 
severely  limited.  Therefore,  our  recoverable  knowledge  of  the  unknown, 


if  deduced  solely  from  the  measurements ,  is  also  going  to  be  severely 
limited.  The  difference  between  the  various  inversion  schemes  is  pri¬ 
marily  due  to  the  additional  information  that  the  set  of  equations  is 
given.  This  additional  information  is  normally  in  the  form  of  "phys¬ 
ically  plausible"  constraints. 

The  list  of  methods  is  too  long  to  repeat  here  [the  interested 

i  2 

reader  can  be  referred  to  Twomey,  Dei rmendjian,  and  Bottiger  J,  but 
it  follows  that,  in  any  method  of  solution  employing  one  or  more  con¬ 
straints,  the  final  solution  will  depend  to  some  degree  on  the  valid¬ 
ity  of  the  constraint  for  the  particular  problem  and,  therefore,  is 
not  completely  objective. 

Given  a  set  of  measurements ,  g. ,  i  =  1,  2,  •••  m,  the  governing 
equation  for  the  inverse  problem  can  normally  be  written  as  a  Fredholm 


integral  of  the  first  kind: 


u 

gi  =  f  k.(x)  f(x) 


where  k. (x)  is  the  l  kernel  of  the  problem,  and  f(x)  is  the  unknown. 
The  measurement  g.  is,  therefore,  a  dot  product  in  the  function  space 

A.  L. 

between  the  iLn  kernel  and  f(x).  Therefore,  the  part  of  the  solution, 
f(x),  that  can  be  recovered  from  the  measurements  must  lie  within  the 
function  space  spanned  by  the  kernels.  Any  solution  or  component  of 


OK 
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a  solution  which  is  outside  this  function  space  requires  additional 
information  (constraints)  and/or  assumptions,  in  order  to  be  recovered 
The  presence  of  measurement  errors  and  uncertainties  in  the  mathemat¬ 
ical  model  and  the  physical  processes  represented  by  the  kernels  act 
to  further  increase  the  unrecoverable  part  of  the  solution. 

Equation  (1)  can  be  approximated  in  some  fashion  by  a  discrete  sum 
such  that  f(x)  is  calculated  at  some  set  of  x's,  and  therefore  may  be 
rewritten  in  a  matrix  form: 

g  =  A  f  (2) 

>  ~ 
where  g  is  an  mxi  vector  made  up  of  the  set  of  measurements;  A  is  an 

mxn  matrix  representing  the  kernel  and  may  contain  weighting  factors 

which  depend  on  the  quadrature  formula  used  for  converting  from  an 

integral  to  a  finite  sum;  and  f  is  the  unknown  nxl  column  vector 

whose  elements  are  f  ^ .  The  direct  solution  of  the  basic  Eq.  (2)  is 


when  the  A  matrix  is  not  square  where  T  denotes  the  transposed  matrix. 

All  methods  of  solution  for  Eq.  (2)  require  the  computation  (directly 

~T~  4 

or  indirectly)  of  the  inverse  of  the  matrix  A  A.  The  instability  as 
a  result  of  the  inverse  operation  and  its  relation  to  the  eigenvalues 

5  0 

and  eigenvectors  of  the  matrix  A  A  will  be  demonstrated  as  follows.  * 
Let  the  exact  equation  to  be  solved  be  written  as: 

b  =  Gf 

where  b  is  the  transposed  measurement  vector  A  g  and  G  is  the  A  A 


(4) 


matrix  obtained  from  Eq.  (2)  after  premultiplication  by  %  .  If  a 

■> 

small  perturbation  6b  is  assumed  in  the  measurements  and  the  result- 
ing  perturbation  6f  is  estimated  such  that 


>  -*•  ~ 


b  +  6b  =  G(f  +  6f), 

then  subtracting  Eq.  (4)  from  Eq.  (5)  yields 


6b  =  G6f 


6f  -  G  6b . 


The  matrix  G  is  a  symmetric  and  real  matrix,  so  that  its  eigen¬ 
values  A.  are  real  and  positive  and  0  <  A1  <  a2  •••  <  An-  The  eigen- 
values  of  G  are  0  <  1/A^  <  1/An  •••  1/A^  and  the  eigenvectors  of 
G  and  G”  are  u1§  u2  •••  un.  Assume  now  that  any  perturbation  vector 
6b  can  be  written  as  a  linear  combination  of  the  eigenvectors  u^ ,  i  = 
1,2  •••  n.  For  the  specific  case  for  which  6b  =  eu^ ,  where  e  is  the 
magnitude  of  6b,  Eq.  (6)  may  be  written  as: 

6f  =  G  eu..  (7 

^  'v'_  1  ■>  *► 

By  using  the  eigenvalue  equation  for  G  ,  G  u^  =  ( 1/ ) u ^  yields: 


6f  =  ~  6b.  (8 

i 

The  error  of  magnitude  j6b|  is  amplified  by  a  factor  1/A^ ,  which  can 
be  very  large  when  A.  is  close  to  zero  for  a  nearly  singular  matrix. 
Furthermore,  from  Eq.  (4),  it  follows  that  f  =  G  b,  and  for  b  =  au^, 
where  a  is  the  magnitude  of  b,  the  minimum  magnitude  for  f  will  be 

+  1  * 

(9 


where  A  is  the  largest  eigenvalue. 


Combining  Eqs.  (8)  and  (9)  yields  an  estimate  for  the  relative  error: 


Sf  \  \ 

4  <  - 


which  says  that  the  relative  error  in  f  is  less  than  or  equal  to  the 
relative  error  in  b  times  the  ratio  of  the  largest  to  the  smallest 
eigenvalue  of  G. 

By  improving  the  quality  of  the  measurements  and  the  model,  the 
magnitude  of  the  error  j 6b j  =  e  can  be  controlled.  However,  it  is 
important  to  note  that  the  direction  of  6b  is  outside  of  control.  The 
probability  always  exists  that  some  of  the  error  vector  will  be  in  the 
direction  of  u.,  and  a  very  large  error  in  the  solution  f  will  result. 

If  it  is  desired  to  include  in  the  error  magnification  analysis  the 
combined  effect  of  error  in  the  measurements  and  error  in  the  kernel 
function,  Twomey1  (pp.  207  -  210)  can  be  followed  to  solve  for  the 
case  in  which  r.  is  an  error  measurement  vector  that  obeys  normal  addi¬ 
tive  statistics,  e.  =  N(0,e2gi-2);  N  represents  normal  statistics,  and 
e  is  a  fraction  error  in  g. .  Letting  be  the  error  in  the  kernel 

k.j(r),  which  obeys  normal  additive  statistics,  n^r)  =  N[0,P2ki 2(r)], 
where  P  is  a  fraction  error  in  k.(r)  and  there  is  no  correlation  between 

l v  ' 

e  and  n.  The  worst  case  of  relative  error  magnification  can  be  deter¬ 
mined  if  the  problem  is  normalized  such  that 


i  j  u 

J  f 2 ( x )  dr  =  1,  and  J  k.2(x) 


dx  =  1,  i  =  1,2  •••  m  to  be 


|5(f)|2/|f+|2  A„  P2A 


A  PA  A 

=  -n-  +  — n—  +  p2  -A 


[  M2/  lg|2]  a1  e\2 


mssmmsism 


or  the  best  case  of  relative  error  magnification  is  expressed  by: 


l«(f)|2/|f|2  *n  |  P2  p2  \ 

Cl-i2/|g|2]  \  e2xl  P  xn  ‘ 


(12) 


Both  Eqs.  (11)  and  (12)  give  Twomey's  results  when  P  +  0  (no  error  in 
the  kernels). 

The  problems  of  more  unknowns  than  available  independent  equations 
and  of  inherent  instability  arise  when  a  solution  of  the  basic  equa¬ 
tion,  Eq.  (2),  is  attempted.  It  is  important  to  note  that,  even  when 
the  number  of  measurements  equals  or  exceeds  the  number  of  unknowns, 
due  to  errors,  the  resulting  equations  are  not  all  independent.  To 
supply  more  independent  equations,  relationships  betweer  the  unknown 
solution  points  can  be  assumed  or  assumptions  about  other  properties 
of  the  expected  solution  can  be  made. 

If  no  constraints  are  available  or  if  those  available  are  insuf¬ 
ficient,  no  solution  with  finite  error  bounds  can  be  found.7  The 
additional  assumptions  (constraints)  can  be  mathematical  (weighting 

functions)  or  physical  (additional  assumed  physical  characteristics) . 

0 

In  some  cases,  the  constraints  can  be  hidden.  As  an  example.  Smith 
required  that  the  solution  be  a  linear  combination  of  the  weighting 
functions,  and  Chanine  ’  used  linear  interpolation  between  the 
unknown  solution  points  (temperature  at  different  elevations)  as 
noted  by  Rodgers.7  The  constraints  can  be  explicitly  stated,  as  did 

1112  13  14 

Twomey,  ’  Fleming  and  Wark,  Wark  and  Fleming,  Herman  et  al., 
and  others. 

The  appl icabi 1 i ty  of  the  constraint  to  a  given  problem  is  very 
important  because  the  additional  set  of  constraint  equations  serves 


as  virtual  measurements.  If  the  proposed  constraints  do  not  properly 
describe  the  physics  of  the  problem,  to  use  them  would  be  tantamount 
to  adding  measurements  with  large  errors.  For  example,  if  on  a  par¬ 
ticular  day  the  aerosol  size  distribution  was  a  very  erratic  function 
and  a  smoothing  constraint  was  applied,  the  result  would  be  two  sets 
of  contradictory  equations.  The  quality  and  type  of  the  constraint 
applied  are  reflected  in  the  solution  and  make  it  nonobjective. 

Frequently  the  solution,  f(x),  is  used  to  calculate  different  prop¬ 
erties  of  f(x)  [ i.e .,  moments  of  f(x)]  or  other  derived  properties  as 
an  input  for  other  models.  Therefore,  it  is  important  to  know  how 
sensitive  the  measurements  are  to  these  computed  properties  and  the 
standard  deviation  of  the  solution.  If  the  measurements  are  not  sensi¬ 
tive  to  these  properties,  then  these  computed  properties  are  not  neces¬ 
sarily  appropriate  for  use  in  modelling.  Analysis  of  the  kernels  can 
give  information  about  types  of  solutions  and  properties  of  the  solu¬ 
tions  to  which  the  measurements  are  sensitive,  as  will  be  shown  later. 

For  those  inverse  problems  for  which  there  exists  a  large  body  of 
observed  solutions  [distribution  functions  f(x)],  such  as  temperature 
soundings,  ozone  vertical  distributions,  etc.,  one  can  make  use  of  this 
large  reservoir  of  observed  "solutions"  using  the  method  of  empirical 
orthogonal  functions.  '  In  this  method,  the  observed  information 
is  put  into  the  form  of  a  matrix  from  which  the  eigenvectors  are  deter¬ 
mined.  From  these  eigenvectors,  a  set  of  orthonormal  basis  functions 
may  be  constructed,  which  are  used  as  additional  information  to  con¬ 
strain  the  unknown  solution  to  be  composed  of  a  linear  combination  of 
these  observed  distributions.  It  is  inherently  assumed  in  this  method 
that  the  orthonormal  basis  functions  form  a  complete  set. 


For  those  problems  for  which  we  do  not  have  any  reliable  library 
of  observed  "solutions,"  e,$.,  when  f(x)  is  an  aerosol  size  distribu¬ 
tion,  the  basis  functions  can  be  constructed  from  a  library  of  mathe¬ 


matical  functions.  Although  these  mathematical  functions  do  not 
describe  existing  functions  f(x),  they  can  produce  orthonormal  basis 
functions  from  which  many  types  of  anticipated  solutions  can  be  con¬ 
structed.  These  orthonormal  basis  functions  will  be  called  pseudo- 
empirical  orthogonal  functions,  as  their  source  is  from  a  library  of 
mathematical  functions.  Inherently  assumed  in  the  method  is  that  the 
library  of  assumed  functions  can  be  used  in  linear  combinations  to 
yield  any  real  function  which  describes  the  unknown  f(x). 

The  approach  of  expanding  an  unknown  function  as  a  linear  combi¬ 
nation  of  orthonormal  basis  functions  can  be  found  in  many  mathematical 
reference  books.  ’  In  this  work,  this  approach  will  be  applied  for 
solving  the  inverse  problem. 

1.2  Method  of  Solution 

It  is  assumed  that  the  unknown  solution  function  f(x)  can  be  con¬ 
structed  from  a  linear  sum  of  orthonormal  basis  functions  $(x)  with 
coefficients  ai ,  i  =  1,2  •••  p, 

P 

f (x)  =  y  vi(x)  •  ( 13 ) 

i  =1 

The  validity  of  this  assumption  will  be  discussed  further  on.  The 
coefficient  set  a.  can  be  represented  in  matrix  notation  by  a  column 
vector  a(pxl)  whose  elements  are  a  ■ .  The  basis  functions  ^(x),  i  = 

1.2  •••  p,  can  be  written  in  matrix  notation  as  a  column  vector  $(x), 
(p*l),  where  each  element  is  <j>.(x)  and  x  is  a  continuous  variable.  If 

13 


iMtaacaai^^ 


x  is  discretized  (x1  to  x  ),  the  basis  functions  can  be  written  as  a 
matrix  ij»(pxn),  while  f  is,  as  before,  a  column  vector  (nxi).  By  using 
the  above  matrix  notation,  the  unknown  solution  can  be  written  as: 

*  ~T  + 

f  =  ♦a.  (14) 

Substitution  of  the  expansion  for  the  unknown  f(x)  [Eq.  (13)]  into  the 
basic  Eq.  (1)  yields: 

bp  p  b 

Si  ■/  a j  * j ( x )  dx  k i  ( x ) <j> ^.  ( x )  dx  (15) 

a  j=l  j  =1  a 

or  in  matrix  notation: 

g  =  Aa  ,  (16) 

where  now  the  A  matrix  elements  are  composed  of  the  various  terms 
of  the  integral  on  the  right-hand  side  of  Eq.  (15),  which  are  inner 
products  between  the  physical  kernel  of  the  problem  k..(x)  and  the 
mathematical  basis  functions  <j>j(x),  which  express  assumed  prior 
knowledge  about  the  unknown  function  f(x)  such  that 

b 

A..  =  J  k.(x)*.(x)  dx  .  (17) 

a 

The  condition  number  (the  ratio  between  the  largest  and  smallest 
eigenvalues  of  ft  A)  is  generally  large  and,  as  a  result,  the  solution, 
a  =  (ft^ft)-1ft^g  and  f  =  ^a,  is  unstable.  This  results  from  the  fact 
that  the  number  of  unknown  coefficients  (a-)  which  are  needed  for  the 
approximation,  Eqs.  (13)  and  (15),  is  usually  much  larger  than  the 
limited  number  of  independent  measurements.  As  a  result,  it  becomes 

14 
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necessary  to  employ  additional  sets  of  equations  in  the  form  of 
constraints  that  express  properties  anticipated  in  the  expected 
solution. 


The  first  constraint  employed  here  is  a  smoothing  constraint, 
since  f(x)  is  expected  to  be  smooth  for  most  low-resolution  measure¬ 
ments  of  .  A  smoothing  constraint  will  also  serve  as  a  low-pass 
filter  that  will  reject  high-frequency  oscillations.  Mathematically, 
high-frequency  components  usually  are  nearly  orthogonal  to  the  kernels 
and  are  therefore  undesirable  because  they  do  not  contribute  signifi¬ 
cantly  to  the  measurements. 

A  second-order  derivative  of  the  solution  with  respect  to  x  will 
be  used  as  a  measure  of  smoothness,  and  the  constraint  applied  will 
be  to  minimize  the  sums  of  the  squares  of  the  second  derivatives.  In 
matrix  notation,  the  smoothing  operation  can  be  written  as: 


a2f(*)  ~  ~t+ 

-Si-  =  S  *  a 


and 


E 


3"f(xi) 


=  a  4>S  S 4>  a  =  q2 


(18) 


where  S  is  an  operator  (matrix),  as  derived  by  Twoiney,1  which,  when 
+  ~T  + 

applied  to  f  =  <j>  a,  yields  the  second  derivatives  at  each  point. 
Letting  the  matrix  Js  S?  =  H<-,  the  second-order  smoothing  matrix, 
Eq.  (IB)  can  be  rewritten  as: 


q2  =  aTH$a  (19) 

where  q2  is  a  scalar  parameter  indicating  the  degree  of  smoothness  of 
the  solution,  f(x).  As  the  numerical  value  of  q?  becomes  smaller,  the 


solution  becomes  smoother. 


The  second  constraint  to  be  employed  is  the  condition  of  non-neg¬ 


ative  solution  points.  Unlike  the  property  of  smoothness,  which  is 
present  in  most  but  not  all  cases,  positivity  is  a  property  that  must 
be  observed  at  all  times  ( e.g . ,  when  f(x)  is  the  aerosol  size  distri¬ 
bution:  temperature  profile,  ozone  amount,  etc.). 

The  positivity  property  can  be  formulated  if  a  function  q3  is 
created  such  that 

q  (x.)  =  f2(x.)  -  y.f(x  )  y  >  0  (20) 

J  «J  J  J 

where  y.  is  the  jth  positive  component  of  a  suitably  chosen  vector  to 

J 

be  described  later.  Inspection  of  Eq.  (20)  and  the  graph  in  Fig.  1 


shows  that  the  function  q~(x.)  results  in  a  small  magnitude  for  posi- 

6  J 

tive  f(x.)  and  a  large  magnitude  for  negative  f(x.)*  The  slope  for 
J  J 

negative  values  of  f(x.)  is  much  steeper  than  is  the  slope  for  posi- 

0 

tive  f(x  ).  The  function  q  (x.)  produces  a  minimum  (a  single  minimum) 

J  J  J 

for  a  positive  value  of  f(x.)  =  y,/2.  The  constraint  of  minimizing 

J  J 

the  function  q  (x.)  will  push  the  solution  toward  y./2,  a  positive 
^  J  J 

number  since  y.  was  clearly  positive.  Equation  (20)  can  be  written  in 
J 

matrix  form  as: 

q3(xj)  =  aT$Ej£J$Ta  -  y^a7#^.  (21) 

where  E.  is  a  column  vector  (nxl)  in  which  all  the  elements  are  zero, 

vJ 

except  for  one  in  the  i*"^  element.  Equation  (20)  is  true  for  any  point 

x.  so  that  there  are  n  equations  represented  by  Eq.  (20)  for  all  points 
J 

*i  to  v 

The  Euler-Lagrange  method  will  be  employed  to  solve  the  problem 


where : 


|Aa  -  g|2  <_  q1  , 


where  q1  is  the  residual  sum  of  the  errors  squared  between  the 


+  ■>  ~T> 

puted  o  calculated  from  the  solution  f  =  j>  a  and  the  data  measurements 


g,  subject  to  the  constraints  of  minimizing  q  (Eq.  19)  and  of  mini¬ 


mizing  n  quantities  q  (x.)  for  the  n  equations  represented  by  Eq.  (21) 

^  J 


The  inequality  sign  in  Eq.  (22)  can  be  replaced  by  an  equality  sign, 


and  the  problem  becomes  one  of  holding  constant  the  quantity  q1 


qA  =  (Aa  -  g)T(Aa  -  g). 


subject  to  the  constraints  of  minimizing  q2  and  q  (j).  The  solution 

-► 

for  Eq.  (23)  and  Eqs.  (19)  and  (21)  is  the  vector  a,  such  that  for 


any  i ,  i  =  1,2  •••  p, 


k  +  +E  vk =  0 


where  is  the  smoothing  Lagrange  multiplier  and  is  the  j  posi 


tivity  Lagrange  multiplier  for  f(x)  at  x. 

J 


In  matrix  notation,  the  solution  for  a  (Eq.  24)  can  be  written  as: 


R'Kt  v's  vvl* 


fTrTl  1 [.I 


A  g  + 


-aiy.?E. 

2  J  J 


Equation  (25)  may  be  simplified  if  the  positivity  constraint  is  applied 


equally  to  each  solution  point  x.,  i.e.  ,  y  .  =  y  for  all  j’s.  Thus, 

J  PJ  K 


Eq.  (25)  can  be  rewritten  as: 


a  =  £TA  +  y,Hc  + 
b  s 


i  VVj*'  f  \  •  !26) 

j=l  J  *•  j=l 


Let  the  positivity  constraint  matrix  Hp  be: 


and  the  positivity  constraint  vector  h^(y)  be: 


+  * 
h 

P 


(y’> 


(28) 


j=l 


where  the  y  column  vector  (n*l)  elements  are  y.,  j  =  1,2  •••  n. 

J 


Combining  Eqs.  (26),  (27),  and  (28)  yields  the  final  form  of  the 
solution  to  the  basic  Eq.  (2) 


> 

a 


[*T 


A  + 


ySHS 


+  Y, 


Vp(y> 


(29) 


and 


f  = 


~T+ 
<!>  a 


(30) 


where  and  yp  are  undetermined  Lagrange  multipliers. 

Equations  (29)  and  (30)  are  solved  by  an  iterative  process.  It 
is  assumed  initially  that  any  given  x  in  f(x)  is  equally  likely  to 
be  present;  therefore,  a  flat  size  distribution  function  will  be  used 


+  ♦/])  4 

as  a  first-guess  solution  for  f.  A  first-iteration  vector,  yv  '  =  2f, 


*7  1  ) 

is  substituted  in  Eq.  (29),  where  a  first-iteration  vector  av  '  is  cal¬ 


culated.  A  superscript  denotes  the  iteration  number.  For  the  next  iter¬ 
ation  f^  is  calculated  from  a^  through  Eq.  (30),  and  y^  =  2f^ 

[for  positive  elements  in  fv  ;]  is  substituted  in  Eq.  (29)  to  solve  for 

+(21 

a  second  iterative  solution  av 

The  process  is  repeated  where  the  constraint  equation,  Eq.  (20),  is 

♦  * 

used  to  force  the  solution  f  towards  y/2  (all  elements  of  y  are  positive) 


so  that  any  negative  values  for  f(x)  that  may  appear  are  encouraged  to 


become  positive  in  successive  iterations  by  the  positivity  constraint. 

The  iteration  process  is  stopped  when  all  the  f  elements  are  positive 

> 

and  the  computed  values  of  g  from  this  solution  agree  with  the  measure¬ 
ments  for  each  y.  within  some  predetermi ned  accuracy. 

* 

The  significance  of  negative  elements  in  the  solution  f  should 
be  checked  by  setting  negative  values  to  be  zero  to  create  a  positive 
vector  f  and  then  computing  two  measurement  vectors  from  both  solu¬ 
tions.  In  cases  where  the  two  measurement  vectors  differ  by  less  than 
the  errors,  it  can  be  concluded  that  the  negative  elements  in  f  are 
insignificant. 

The  numerical  values  of  the  undetermined  Lagrange  multipliers  y^ 
and  Yp  are  chosen  for  the  iterative  process  as  follows.  When  Eq.  (29) 
is  used  only  with  the  positivity  constraint  (Yr  =  0),  y  is  chosen  for 

J  P 

the  iterative  process  such  that  y  H  <  A^A  and  such  that  the  ratio  of 

r  r 

the  largest  to  the  smallest  eigenvalues  of  the  matrix  (A  A  +  YpHp)  is 

minimal  for  yp  satisfying  the  first  condition.  This  latter  condition 

decreases  the  relative  error  magnification. 

If  Eq.  (29)  is  used  with  both  Lagrange  multipliers,  Yc,  and  Yp,  Y$ 

and  y  were  chosen  such  that  ycHc  =>  y  h  in  order  that  both  constraints 
p  S  S  'p  p 

will  affect  ATA  in  Eq.  (29),  and  the  condition  number  of  (A  A  +  + 

-w  v-  T 

Y  H  )  is  minimal  for  ycHc  +  Y„H  <  K  A. 

P  P  S  S  p  p 

Finally,  the  limits  [a,b]  in  Eq.  (1)  for  the  inversion  process 
should  be  chosen  with  care.  Normally  they  are  not  known  accurately, 
and  their  choice  can  seriously  affect  the  solution  vector.  The  upper 
limit  [b]  should  normally  be  decreased  if  the  iteration  process  cannot 
produce  both  a  positive  f  and  an  agreement  between  the  computed  measure¬ 
ments  and  the  input  measurements. 


1.3  Standard  Deviation  of  the  Solution 


Assuming  that  there  is  no  error  in  the  mathematical  model  and  in 
the  basis  functions,  the  effect  of  random  noise  in  the  measurements  on 
the  standard  deviation  of  the  solution  points  will  be  estimated. 

Following  Frieden's22  derivation  for  the  least-squares  problem, 
Eqs.  (29)  and  (30)  are  combined  to  give 


f  =  ?T  [aTA  ♦  ,1  +  Vi?]'1  [ATg  -  Yh*(;j 


(31) 


Let  the  matrix  D  be 


rT 


[Sta 


+  Yc^c  +  Y  H  1 
TS  S  'p  pJ 


-1 


(32) 


so  that  Eq.  (31)  can  be  written  as: 


fff  =  ^'g  +  Yphp(y) 


(33) 


where  the  circumflex  indicates  an  estimation.  The  error  in  g  is  modeled 


as  a  Gaussian  additive  noise  such  that  g,  =  N(g*,  a.  ),  where  g  is  the 

J  J  J  J 

th  2  s 

exact  j  measurment  and  a.  is  the  variance  about  g.  and  N  represents 

J  0 


normal  statistics.  Taking  the  expectation  of  Eq.  (33)  produces: 


D<f>  =  ATJS  +  Yphp(y)  (33A) 

Subtracting  Eq.  (33A)  from  Eq.  (33)  yields: 

A  A 

D(f  -  <f>)  =  ATu  (34) 

-b 

v/here  u  is  the  error  measurement  vector  whose  elements  obey  statistics 
2 

of  the  form  N(0,oi  ).  Decomposing  Eq.  (34)  yields: 

Evv'V  =Efluuj  l35) 

j  j 


In  a  similar  way,  Eq.  (34)  yields: 


Multiplying  Eqs.  (35)  and  (36)  and  taking  the  expectation  of  the 


product  yields  (in  matrix  notation): 


0  rJD1  =  ATAA  (37) 

where  r?  is  the  covariance  matrix  of  the  expected  solution  f  of  which 

the  elements  are  (r~)..  =  <(<f.>  -  f.)(<f.>  -  f.)>  and  A  is  a  diagonal 

■  l  j  i  l  J  J 

matrix  of  which  the  elements  are  the  expected  values  of  the  variances 

2 

of  the  measurement  errors  .  Solving  for  the  covariance  matrix 
yiel ds : 

r*  =  D”1  AT  a  at  (d_1)T  (38) 

where  the  identity  (D-1)^  =  (D^)  1  was  used. 

The  matrix  may  be  determined  from  Eq.  (38)  with  the  matrix  D 
determined  from  Eq.  (32).  The  square  root  of  the  diagonal  elements  of 


the  matrix  r"  are  the  standard  deviation  of  the  solution  points. 

A 

The  standard  deviations  of  the  expected  solutions  f  are  a  result 
of  the  mapping  of  error  from  the  measurement  space  onto  the  solution 
space.  This  mapping  depends  on  the  numerical  value  of  the  Lagrange 
multipliers  y<-  and  y  (through  the  matrix  D-1),  as  well  as  the  matrix 

O  P 

KJ  A. 


Qualitatively,  it  can  be  seen  that,  as  the  numerical  values  of 

Y,.  and  y  increase,  elements  of  the  0”1  matrix  become  smaller  so  that 
S  p 

the  standard  deviations  of  the  solution  points  decrease.  The  explana¬ 
tion  for  this  behavior  is  the  fact  that  the  model,  including  the  basis 
functions  and  the  constraint  equations,  is  assumed  to  be  correct  and 
unbiased.  Increasing  y^  and  y  decreases  the  condition  number  of  ATA 

o  p 

and  thereby  decreases  the  error  magnif ication.  However,  as  Yc  and  y 

O  p 

increase,  the  measurements  computed  from  the  solution  will  tend  to 


deviate  more  from  the  data  measurements.  The  behavior  of  the  standard 
deviation  of  f  suggests  that  it  is  better  to  use  as  large  a  numerical 
value  as  possible,  consistent  with  the  condition  stated  in  Section  1.2 
for  and  y  . 

O  P 

1.4  Basis  Functions 

The  basis  functions  are  crucial  to  the  solution  process  because 
the  solution  [Eq.  (30)]  is  a  linear  sum  of  the  basis  functions  and 
because  the  numerical  quadrature  matrix  A  [Eq.  (17)]  depends  on  the 
form  of  the  basis  functions.  Assuming  a  known  distribution  func¬ 
tion  n(x),  the  coefficients  a^  for  the  constructed  distribution  f(x) 
[Eq.  (13)]  can  be  obtained  by  using  the  orthonormal  properties  of  the 
basis  functions 

b 

3i  =  I  dx  (39) 

a 

P 

f(x)  =  £  ai*.(x)  (40) 

i=l 

Any  solution  f(x)  obtained  from  an  inversion  process  cannot  be 
closer  (in  a  least-squares  sense)  to  n(x)  (the  true  solution)  than  the 
solution  constructed  from  Eqs.  (39)  and  (40).  The  constraint  equations 
and  the  kernels  can,  at  most,  produce  a  coefficients  set  which  is  the 
one  calculated  in  Eq.  (39),  where  the  coefficients,  a^  ,  are  computed 
from  the  true  solution,  n(x). 

The  solution  f(x)  can  be  obtained  from  a  set  of  basis  functions 
constructed  from  the  kernel  functions  of  the  problem  by  a  Gram-Schmidt 
process.  These  basis  functions  will  hereafter  be  referred  to  as 


natural  basis  functions  because  they  originate  directly  from  the  prob¬ 
lem.  Another  possible  source  for  basis  functions  is  empirical  basis 
functions,  which  will  later  be  referred  to  as  "pseudo'!-empi rical  basis 
functions. 

When  using  natural  basis  functions  in  addition  to  a  smoothing  con¬ 
straint,  the  solution  f(x)  will  be  the  smoothest  solution  consistent 
with  the  measurements  which  may  be  constructed  from  the  basis  func¬ 
tions.  When  empirical  basis  functions  are  used,  the  solution  f(x) 
lies  partially  outside  the  function  space  of  the  measurements  (kernels) 
Therefore,  the  use  of  empirical  basis  functions  may  act  as  additional 
information  to  the  measurements.  Conversely,  the  use  of  empirical  basis 
functions  may  eliminate  many  possible  mathematical  solutions  for  f(x). 
However,  the  constraining  effect  of  empirical  basis  functions  can  be 
minimized  if  they  are  chosen  in  such  a  way  that  they  lie  in  the  func¬ 
tion  space  of  the  anticipated  distribution  functions.  Therefore,  in 
order  to  make  use  of  empirical  basis  functions,  some  prior  knowledge 
about  the  anticipated  distributions  must  be  obtained. 

1.4.1  Natural  Basis  Functions 

Analysis  of  the  natural  basis  functions  can  give  an  insight  into 
the  inherent  limitation  of  a  solution  obtained  from  the  inversion 
process  solely  from  the  measurements,  when  no  additional  information 
is  used.  This  analysis  is  important  in  order  to  determine  what  part 
of  the  solution  f(x)  and  its  computed  properties  is  a  result  of  the 
inversion  process  and  of  the  additional  information  used,  and  what 
part  is  a  direct  result  of  the  measurements. 

If  the  anticipated  solutions  are  functions  n(x),  then  the  type 
of  solutions  that  can  be  inferred  from  measurements  calculated  from 
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gi(y)  =  fk.(x)n(x)  dx  can  be  computed  from  Eqs.  (39)  and  (40),  when 

the  kernels  of  the  problem  are  used  to  compute  the  basis  functions 

[ i.e .,  the  set  <|k(x)].  When  an  optimal  set  of  kernels  is  used  [i.e., 

a  set  of  kernel  functions  which  have  maximum  linear  independency 

between  them],  the  reconstructed  functions  f(x)  show  the  parts  of 

n(x)  to  which  the  measurements  are  sensitive. 

The  information  about  f(x),  such  as  different  moments  of  f(x) 

( e.g .,  total  number  of  particles,  mean  radius,  total  surface  area, 

and  total  volume,  when  f(x)  is  an  aerosol  size  distribution)  can  be 

obtained  by  an  operation  of  a  weighting  function  w(x)  on  f(x),  i.e., 
b 

y"w(x)f(x)  dx.  The  weighting  functions  w(x)  for  the  quantities  listed 
a  2  3 

above  are,  respectively,  l,x,x  ,x  .  w(x)  =  x  will  result  in  a  mean 
b 

radius  if  J  f(x)dx  =  1.  The  process  of  deducing  the  various  moments 
a 

of  f(x)  from  measurements  can  be  viewed  as  an  approximation  of  the 
appropriate  weighting  function  w(x)  by  a  linear  combination  of  the 
kernels  for  which  the  measurements  have  been  made.23  Thus,  for  a 
general  weighting  function  w(x): 

m 

w(x)  s  ^  ajk.(x)  (41) 

i  =1 


J  w(x)f(x)  dx  =^aigi  .  (42) 

a  i  =1 


The  degree  of  approximation  for  w(x)  in  Eq.  (41)  will  be  defined  as 


where  e(x)  =  w(x)  -  w'(x)  and  w‘(x)  is  the  approximation  to  w(x)  using 
Eq.  (41).  If  the  approximation  of  w(x)  results  in  a  big  error  e(x), 
the  correspondi ng  moment  of  f(x)  and  the  physical  quantity  it  describes 
cannot  be  recovered  with  much  reliability. 

An  estimate  of  the  error  in  deducing  average  quantities  sue1’'  as 
average  number  density  f(x),  average  x  ,  and  average  x  in  intervals 
Ax  can  be  calculated  if  an  appropriate  w(x)  is  chosen.  For  an  average 
quantity  xn  between  xt  and  x2,  w(x)  will  be 


r 

J  x 


x!  <  x  <  x2 


w(x)  = 


el sewhere. 


1.4.2  Empirical  Basis  Functions 


In  cases  where  the  use  of  natural  basis  functions  produces  a  less 
than  satisfactory  result,  it  is  appropriate  to  search  for  a  set  of 
basis  functions  that  will  produce  better  results. 

Given  a  set  of  functions  f.(x),  i  =  1,2  •••  N,  a  set  of  basis 
functions  4>.  (x),  i  =  1,2  •••  N  can  be  constructed  such  that  any  of  the 
f..(x)  functions  can  be  obtained  by  using  a  linear  combination  of  the 
basis  function  set  such  as: 


fi(x)  2  J.  bijVx) 

j=l 


The  basis  functions  ( x )  can  be  constructed  from  the  eigenvalues  and 


»T,y 


. y-v-v-\v 

.  •  -  •.  *»-  *  -  «r.  .  x 


g.  (y)  =  J k.(x)n(x)  dx  can  be  computed  from  Eqs.  (39)  and  (40),  when 

the  kernels  of  the  problem  are  used  to  compute  the  basis  functions 

[i.e.,  the  set  <j>.  ( x ) ] .  When  an  optimal  set  of  kernels  is  used  [i.e., 

a  set  of  kernel  functions  which  have  maximum  linear  independency 

between  them],  the  reconstructed  functions  f(x)  show  the  parts  of 

n(x)  to  which  the  measurements  are  sensitive. 

The  information  about  f(x),  such  as  different  moments  of  f(x) 

(e.g.}  total  number  of  particles,  mean  radius,  total  surface  area, 

and  total  volume,  when  f(x)  is  an  aerosol  size  distribution)  can  be 

obtained  by  an  operation  of  a  weighting  function  w(x)  on  f(x),  i.e . , 
b 

J w(x)f(x)  dx.  The  weighting  functions  w(x)  for  the  quantities  listed 
3  2  3 

above  are,  respectively,  l,x,x  ,x  .  w(x)  =  x  will  result  in  a  mean 
b 

radius  if  f  f(x)dx  =  1.  The  process  of  deducing  the  various  moments 
a 

of  f(x)  from  measurements  can  be  viewed  as  an  approximation  of  the 
appropriate  weighting  function  w(x)  by  a  linear  combination  of  the 
kernels  for  which  the  measurements  have  been  made.23  Thus,  for  a 
general  weighting  function  w(x): 

m 

w(x)  =  ^  a.k.(x)  (41) 

i  =1 


J  w( x )f ( x )  dx  aigi  .  (42) 

a  i  =1 

The  degree  of  approximation  for  w(x)  in  Eq.  (41)  will  be  defined  as 
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where  e(x)  =  w(x)  -  w'(x)  and  w'(x)  is  the  approximation  to  w(x)  using 
Eq.  (41).  If  the  approximation  of  w(x)  results  in  a  big  error  e(x), 
the  correspondi ng  moment  of  f(x)  and  the  physical  quantity  it  describes 
cannot  be  recovered  with  much  reliability. 

An  estimate  of  the  error  in  deducing  average  quantities  such  as 
average  number  density  f(x),  average  x  ,  and  average  x  in  intervals 
Ax  can  be  calculated  if  an  appropriate  w(x)  is  chosen.  For  an  average 
quantity  xn  between  x1  and  x2,  w(x)  will  be 


r 

J  x 


XJ  <  X  (  x2 


w(x)  = 


el sewhere. 


1.4.2  Empirical  Basis  Functions 


In  cases  where  the  use  of  natural  basis  functions  produces  a  less 
than  satisfactory  result,  it  is  appropriate  to  search  for  a  set  of 
basis  functions  that  will  produce  better  results. 

Given  a  set  of  functions  f.(x),  i  =  1,2  •••  N,  a  set  of  basis 
functions  <{>.  (x),  i  =  1,2  •••  N  can  be  constructed  such  that  any  of  the 
f.(x)  functions  can  be  obtained  by  using  a  linear  combination  of  the 
basis  function  set  such  as: 


f  .  (x)  =  V  b .  .  4> .  ( x ) 

i  Ls  ijj 
j  =1 


The  basis  functions  ^ (x )  can  be  constructed  from  the  eigenvalues  and 


W 
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eigenvectors  of  the  covariance  matrix  C  (NxN)  (Twomey,1  pp.  139-143) 
formed  from  the  original  set  of  functions  f^(x),  such  that 

b 

Cij  =  /  ¥x>fj(x)  dx  *  («) 

a 

The  orthonormal  basis  functions  from  the  set  f^(x)  are 

♦i(x)  =  ^"1/2  F(x)  ,  (47) 

where  and  u.  are  the  i^  eigenvalue  and  eigenvector,  respectively, 

~  4- 

of  the  matrix  C,  and  F(x)  is  a  column  vector  (Nxi),  the  elements  of 
which  are  continuous  functions  f^x),  i  =  1,2  •••  N.  If  p  basis  func¬ 
tions,  where  p  _<  N,  are  used  to  represent  the  N  functions  f(x)  in 
Eq.  (45),  then  the  overall  fraction  of  the  N  functions  f(x)  accounted 
for  by  the  p  basis  functions  is 

P  /N 

2  y  2  s  •  <48> 

j=i  /  j=i 

In  principle,  there  are  two  sources  for  a  library  of  functions  f(x). 
The  first  source  might  be  from  many  measurements  of  f(x)  that  were  col¬ 
lected  over  the  years,  but  this  type  of  library  is  not  always  available. 
The  second  source  for  a  library  model  of  f(x)  functions  can  be  obtained 
by  simulating  many  functions  f(x)  according  to  theoretical  models. 

The  set  of  basis  functions  for  the  expansion  of  f(x)  in  Eq.  (30)  is 
not  unique.  For  example,  any  two  orthogonal  unit  vectors  rotated  in  an 
arbitrary  angle  to  the  x  axis  can  describe  a  vector  in  the  x-y  plane. 
Similarly,  but  in  the  function  space,  the  nonuniqueness  is  also  true 
for  the  orthogonal  empirical  basis  functions. 


v.  \v.-.‘ -.-.w 


The  criteria  for  checking  a  proposed  empirical  basis  function  set 
are:  first,  that  they  should  be  able  to  construct  as  many  types  of  the 
desired  functions  f(x)  as  possible;  second,  that  their  orientation  will 
be  such  that  the  greatest  amount  of  the  m-dimensional  function  space 
of  the  kernels  k^x),  i  =  1,2  •••  m  will  be  within  the  p-dimensional 
function  space  of  the  basis  functions  4>^{x),  i  =  1,2  •••  p;  and  third, 
that  the  basis  functions  should  be  able  to  construct  as  closely  as 
possible  a  chosen  delta  Dirac  function.  A  minimal  spread  in  the  con¬ 
struction  f(x)  should  result  from  using  basis  functions,  and  any  uncer¬ 
tainty  about  the  location  x  of  the  solution  number  density  f(x)  should 
be  mi nimal . 

Because  the  required  properties  of  the  basis  functions  are  known, 
it  is  possible  to  construct  a  library  source  from  mathematical  func¬ 
tions  f(x)  that  do  not  describe  existing  distributions  f(x)  but  can 
yield  the  desired  basis  functions.  If  the  resulting  basis  functions 
fulfill  the  criteria  stated  above,  these  basis  functions  can  be  used 
for  the  solution  of  Eq.  (1).  The  resulting  basis  functions  from 
the  mathematical  function  library  are  called  the  "pseudo"-empi rical 
orthogonal  functions. 

For  the  case  where  f(x)  is  an  aerosol  size  distribution,  the 
library  of  the  source  functions  f(x)  was  chosen  to  include  normal 
distributions,  spaced  uniformly  every  0.15  urn  between  integration 
limits  [a,b]  and  with  a  standard  deviation  of  0.2  wm,  and  few 
Junge-type  distributions  f(x)  =  x"^v+^,  where  the  v  values  are 
1,  2,  3,  4,  and  5.  The  computed  basis  functions  were  able  to  simu¬ 
late  many  anticipated  aerosol  size  distributions  with  a  very  good 
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accuracy . 


The  orientation  of  the  empirical  basis  functions  relative  to  the 
kernel  functions  determines  the  numerical  quadrature  matrix  A.  The 
components  of  the  basis  functions  which  are  nonorthogonal  to  the 
kernels  will  determine  to  what  extent  measurements  computed  directly 
from  f ( x )  [Eq.  (1)]  will  agree  with  measurements  computed  from  the 
coefficient  vector  a  and  the  matrix  A  [Eq.  (16)].  The  fraction  of 
the  basis  functions  which  is  orthogonal  to  the  kernels  determines  the 
amount  of  information  about  the  unknown  f(x)  distribution  which  cannot 
be  obtained  through  the  measurements . 

The  portion  of  the  kernel  k^(x)  for  measurement  i  which  is  within 
the  function  space  spanned  by  the  p  basis  functions  can  be  calculated 
by  Eq.  (49). 


The  portion  of  the  basis  function  j  which  is  within  the  function  space 
spanned  by  the  kernel  functions  can  be  calculated  by 


where  p..(x),  i  =  1,2  •••  m,  are  orthonormal  basis  functions  constructed 
from  the  kernel  functions  by  a  Gram-Schmidt  process. 

The  approximation  of  f(xQ)  as  a  linear  sum  of  the  basis  functions 
results  in  some  uncertainty  as  to  the  location  of  the  point  xQ  in  the 
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f(x  ).  This  uncertainty  can  be  written  as 


f(*o  1  *0)  =  Z  Vi(xo)  * 


where  &xQ  is  the  uncertainty  in  the  location  xQ  for  f(x)  at  xQ.  The 
magnitude  of  the  uncertainty  AxQ  can  be  estimated  by  substituting  a 
very  narrow  function  at  xQ  for  n(x)  in  Eq.  (39)  and  constructing  the 
approximated  f(x  )  [Eq.(51)].  The  limit  of  a  very  narrow  function  is 
a  delta  function.  In  numerical  form,  the  delta  function  may  be  approx¬ 
imated  by  a  normal  distribution  with  a  very  small  standard  deviation. 

Using  an  empirical  basis  function  is  equivalent  to  changing  the 
kernels  k.(x)  to  empirical  kernels  B^(x).  This  can  be  seen  when  the 
equations  g  =  Aa  [Eqs.  (16),  (17)]  and  a  =  ( 4>  <j>  )  <t>f  [obtained  from 
Eq.  (14)]  are  combined  to  yield: 

g  =  A(*  *T)_1  *f  (52) 

or 

9  =  Bf  (53) 

where  B  =  A( £  $ )"  <£.  If  the  B  matrix  is  computed  for  very  fine  x. 

intervals,  the  trapezoidal  rule  for  integration  is  a  good  approxima¬ 
tion,  and  Eq.  (52)  can  be  assumed  to  be  nearly  equal  to  the  integral 


form,  given  by 


9 ,•  =  f  8i(x)f(x)  dr 


where  B.(x)  are  the  rows  of  8. 

1 

The  unknown  solution  f(x)  is  a  linear  combination  of  basis  func¬ 
tions  constructed  from  the  empirical  kernel  B.(x).  As  a  result, 
similar  analysis  of  the  properties  of  the  retrieved  solution  can  be 
performed  on  B.(x),  as  was  described  for  k.(x). 


3 


1.5  Summary 

A  method  for  solving  the  inverse  problem  was  derived.  The  method 
uses  a  library  of  functions  from  which  a  set  of  orthogonal  basis  func¬ 
tions  is  computed.  The  source  of  the  library  can  be  from  a  set  of  obser¬ 
vations  or  a  set  of  mathematical  functions,  in  which  case  the  basis 
functions  are  pseudo-empirical  orthogonal  functions.  It  is  assumed  that 
any  unknown  solution  f(x)  may  be  constructed  from  a  linear  sum  of  these 
functions.  The  problem  then  becomes  one  of  solving  for  the  unknown  coef¬ 
ficients  of  the  basis  functions.  A  solution  with  a  smoothing  constraint 
and/or  a  positivity  constraint  can  be  obtained.  A  solution  with  the  pos¬ 
itivity  constraint  alone  can  be  useful  when  the  unknown  is  known  to  be 
a  narrow  function  or  an  unsmooth  function.  Analysis  of  the  information 
contained  in  the  measurements  and  the  effect  of  using  additional  infor¬ 
mation  is  given.  This  type  of  analysis  is  important  in  order  to  be  able 
to  use  the  solution  properly. 
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APPLICATION 


2.1  Introduction 

Information  about  the  aerosol  size  distribution  is  important  in 

1—3 

many  different  areas  of  the  atmospheric  sciences,  principally  due 
to  their  effect  on  optical  phenomena  and  radiative  transfer  processes. 
These  effects  depend  on  several  factors,  such  as  wavelength  of  the 
incident  radiation,  refractive  index  of  the  aerosol  material,  and  size 
distribution  of  the  aerosols. 

Given  that  a  priori  knowledge  exists  about  the  refractive  index, 
with  some  minimal  assumptions  about  the  expected  aerosol  size  distri¬ 
bution,  measurements  of  scattered  radiation  can  be  used  to  obtain 
information  about  the  aerosol  size  distribution  under  conditions  of 
independent  scattering  ( i.e .,  where  no  permanent  phase  relation  exists 
between  the  radiation  scattered  by  two  different  particles)  and  where 
the  scattered  radiation  that  undergoes  more  than  one  scattering  event 
is  negligible  {i.e.,  an  optically  thin  scattering  volume).4 

In  the  last  decade,  many  methods  for  inferring  aerosol  size  dis¬ 
tribution  from  optical  remotely  sensed  measurements  were  developed. 

They  include  spectral  extinction  measurements,5-11  aureole  and  forward 

1  ?  -  i  7 

scattering  measurements,  '  combined  scattering  and  extinction  mea- 

1819  20 

surements,  ’  angular  scattering  measurements,  and  backscattered 

21-23 

measurements.  In  all  of  the  aforementioned  methods,  a  wide  diver¬ 

gence  in  the  accuracy  claimed  may  be  observed.  A  critical  review  of 
some  of  these  methods  can  be  found  in  Deirmendjian.2** 

The  aerosol  size  distribution  inferred  from  solar  extinction  and 
solar  aureole  measurements  represents  an  average  size  distribution  for 


the  whole  atmospheric  depth.  The  aerosol  size  distribution  obtained 
from  backscatteri ng  measurements  of  a  pulsed  lidar  system  is  a  local 
property  of  a  scattering  volume  that  can  be  as  small  as  a  few  cubic 
meters  at  a  height  z.25  The  inferred  aerosol  size  distribution 
depends  on  the  assumed  refractive  indices  and  radii  limits  of  the 
aerosols  for  all  methods  of  solution,  as  well  as  the  inherent  assump¬ 
tion  that  the  particles  are  spherical  in  shape.  The  solution  obtained 
from  aureole  and  extinction  measurements  is  less  sensitive  to  the 
assumed  refractive  indices  than  the  one  from  backscattered  measure¬ 
ments.  The  refractive  indices  used  for  aureole  and  extinction  tech¬ 
niques  should  represent  some  type  of  average  refractive  index  for  the 
particulates  throughout  the  vertical  extent  of  the  aerosol  column. 

It  has  been  determined  that  the  backscattered  spectral  measure¬ 
ments  contain  more  information  about  the  particle  size  distribution 

26  4 

than  do  extinction,  aureole,  and  angular  scattering  measurements.  ’ 
This  study  investigates  the  possibility  of  inferring  aerosol  size  dis¬ 
tributions  from  simulated  backscattered  measurements,  such  as  would  be 
obtained  by  monostatic  lidar.27’25  The  results  and  features  of  the 
analyses  for  the  maximum  accuracy  in  the  inferred  solution  (assuming 
spherical  shape  and  known  refractive  indices)  can  set  an  upper  limit 
for  accuracy  on  any  solution  inferred  from  spectral  extinction,  aureole 
and  angular  scattering  measurements.  It  will  be  assumed  that  all  wave¬ 
lengths  between  0.3  um  and  10.6  um  (in  intervals  of  0.1  pm)  are  avail¬ 
able  for  measurements.  The  aerosol  is  taken  to  be  a  tropospheric , 
spherical  rural  aerosol  with  radii  limits  from  0.05  to  10.0  pm,  for 

which  the  residence  time  is  about  a  week.  The  wavelength  dependent 
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refractive  indices  are  taken  from  Kent  et  al. 


Inversions  of  backscattered  radiation  were  obtained  by  Capps  et 
al.,  in  which  the  solution  was  constructed  from  the  basis  functions 
of  the  backscatteri ng  kernel.  This  method  does  not  use  any  constraints 
on  the  solution,  and  therefore  the  solution  can  be  oscillatory  and 
negative,  even  for  the  very  narrow  radii  limits  that  were  i  sed  (0.001 
to  1.3  um).  Zuev  and  Natts22  use  an  iterative  technique  to  determine 
the  refractive  indices  and  the  size  distribution  function  from  multi¬ 
wavelength  extinction  and  backscatteri ng  cross-sections  which  are 
inferred  from  monostatic  lidar  measurements.  The  accuracy  of  their 
method  (in  the  atmospheric  boundary  layer,  z  <  1  km)  is  stated  to  be 
no  greater  than  twice  the  error  in  the  measurements  (using  ruby  and 
neodymium  and  their  second  harmonics  as  laser  sources).  Ben-David  and 
Herman23  use  an  iterative  technique  where  an  initial  guess  is  built 
into  the  kernel  function.  By  successive  iterations,  a  correction  vec¬ 
tor  is  calculated  and  a  solution  is  constructed  subject  to  a  smoothing 
constraint  in  tfa  solution. 

2  9 

In  this  work,  the  psuedo-empirical  orthogonal  function  method 
is  used  for  inferring  size  distribution  of  spherical  aerosols  with 
assumed  refractive  indices.  The  method  uses  empirical  basis  func¬ 
tions  from  which  the  solution  is  constructed  subject  to  a  constraint 
for  non-negative  solution  points  and  additionally  (optionally)  to  a 
smoothing  constraint  upon  the  solution.  The  properties,  limitations, 
and  accuracy  of  the  method  will  be  shown,  along  with  examples  of 
inversion  results  for  four  data  sets.  Possible  applications  for  the 
inferred  aerosol  size  distribution  will  be  discussed. 

2.2  The  Inverse  Problem 

In  a  typical  monostatic  lidar,  a  pulsed  laser  is  transmitted  in 

37 


a  narrow  beam,  and  a  receiver  telescope  is  co-aligned  to  collect  the 
radiation  scattered  in  the  backward  direction.  The  received  lidar 


response  may  be  described  in  terms  of  the  lidar  equation 

/  R 

Px(R,t)  =  Pxo(t)  C‘/R2  6x(R)expf-2  J  ox(r)dr 

\  0 


25 


(55) 


where  P  (R,t)  is  the  power  received  at  time,  t,  from  distance,  R,  and 

A 

P  (t)  is  the  transmitted  power,  and  C'  is  the  instrumental  calibra- 

A 

tion  factor.  The  term  0.(R)  is  the  volume  backscatteri ng  cross- 

A 

section,  and  a  (R)  is  the  volume  extinction  coefficient,  both  at  wave- 

A 

length  X  and  range  R.  Assuming  the  measurements  of  P  (R,T)  are  made 
at  optically  thin  ranges,  R,  we  may  neglect  the  attenuation  term  in 
Eq.  (55).  Then  . 


P x( R »t )  =  Cex(R)  =  C  J  KA(r,m)f(r)dr  =  g; 


(56) 


where 


C  =  C  /R  x  PXQ(t)  . 


K  (r,m)  is  the  particle  backscatteri ng  cross-section  for  radius  r  and, 

A 

with  refractive  index,  m  (to  simplify  the  notation,  the  dependence  of 
the  kernel  on  the  refractive  index  will  be  omitted),  and  f(r)  is  the 
number  of  particles  of  radius  r  per  unit  volume  per  unit  interval  in 
r.  The  measured  backscattered  flux  P  (R,t)  at  wavelength  x  will  be 

A 

referred  to  as  gx  to  put  the  notation  into  the  more  usual  form.  If 
Eq.  (56)  is  written  in  numerical  form,  it  becomes 

g  =  A  f  (57) 

► 

where  g  is  an  mxl  column  vector  whose  elements  are  the  backscattered 
flux  at  at  m  wavelengths,  A  is  an  mxn  matrix  composed  of  the  particle 


backscatteri ng  cross-sections  for  the  various  wavelengths  and  radii 

intervals  and  also  contains  any  numerical  quadrature  required  in  addi- 

*)■ 

tion  to  the  constant  C,  and  f  is  the  unknown  n*l  column  vector  whose 
elements  are  the  number  densities  at  the  n  discrete  radii.  For  the 
remainder  of  this  work,  we  assume  that  the  measurements,  g  ,  are  given 
and  examine  the  feasibility  and  accuracies  obtainable  in  inverting  the 
measurements  to  obtain  the  unknown  f(r).  In  addition,  we  introduce  a 
new  inversion  approach  through  the  use  of  pseudo-empirical  orthogonal 
functions  to  describe  the  unknown  f(r)  and  also  introduce  a  positivity 
constraint  which  helps  insure  that  the  values  of  f(r)  so  obtained  are 


not  physically  unreal  negative  numbers. 

To  see  the  difficulties  of  solving  Eq.  (56)  or  its  numerical  equiv¬ 
alent,  Eq.  (57),  it  should  be  noted  that  the  measurements  g  are  actu- 

A 

ally  equal  to  inner  dot  products  in  the  function  space  (Hilbert  space) 


between  the  kernels  k^(r)  and  the  unknown  function  f(r)  in  Eq.  (56)  or 

equally  dot  products  of  the  row  vectors  of  the  A  matrix,  A^  (j)  with 

the  unknown  column  vector  f  (i.e.,  g.  =  5]  A. .f .). 

i  j  ij  J 

The  above  geometrical  viewpoint  of  Eqs.  (56)  and  (57)  addresses 


the  inverse  problem  thusly:  given  m  projections,  g^,  of  an  unknown 
function  (vector)  f(r)  on  some  set  of  m  skew  functions  (vectors)  k^(r) 
to  construct  the  unknown  function  (vector)  f(r).  As  as  result  of  the 
erratic  fine  structure  of  the  kernel  (Fig.  2),  a  large  number  of  solu¬ 


tion  points  n  must  be  taken  so  that  the  integral  in  Eq.  (56)  can  be 
evaluated.  Hence,  there  are  in  equations  and  n  unknowns  where  n  may  be 
larger  than  m  (n  is  usually  on  the  order  of  50,  and  m  is  on  the  order 


•  • 


2.3  The  Information  Available  in  the  Measurements 
2.3.1  Independence  of  the  Kernel  Functions 

An  analysis  of  the  kernels  k^(r)  of  the  inverse  problem  can  give 

an  insight  into  the  information  available  in  the  measurements  about 

the  unknown  aerosol  size  distribution.  This  analysis  utilizes  results 

2  9 

from  the  previous  work  by  the  authors. 

Recent  developments  in  laser  technology  make  available  a  wide 
range  of  wavelengths.25  In  order  to  examine  the  theoretically  maximum 
information  content  possible,  we  assume  all  wavelengths  between  0.3  urn 
and  10.6  urn  in  intervals  of  0.1  urn  may  be  used. 

One  hundered  and  four  wavelengths  between  0.3  urn  to  10.6  urn  were 
thus  selected  and  the  backscattering  cross-sections  computed  for  each 
wavelength  as  a  function  of  size  and  refractive  index  of  the  rural 
tropospheric  aerosol.  These  functions  were  used  as  kernels  k^r), 
i  =  1,2, ••  *104,  and  were  arranged  in  order  of  maximum  independency 
between  them.  The  independency  within  the  kernel  functions  was  mea¬ 
sured  as  the  maximum  orthogonal ity  between  kernel  i  and  all  the  other 
kernels  j  *  i . 

30  31 

The  interdependence  between  the  kernels  ’  k^(r),  i  =  l,2,»»*m, 

can  be  demonstrated  as  follows.  In  principle  and  from  a  purely  mathe¬ 
matical  point  of  view,  two  or  more  of  the  kernels  are  linearly  depen¬ 
dent  if  there  is  a  set  of  coefficients  a.,  i  =  1,2, •••m,  such  that 

m 

£  oc.k.(r)  =  0  (58) 

and 


in  order  to  eliminate  the  trivial  case. 


In  a  real  physical  situation,  there  are  uncertainties  in  the  mea¬ 


sured  quantities  or  in  the  mathematical  model.  In  this  situation,  a 


linear  combination  of  the  kernels  that  results  in  the  right-hand  side 


of  Eq.  (58)  equalling  some  value  e,  e  >  0  but  less  than  the  uncertain¬ 


ties  involved,  is  no  better  than  a  zero  value  and  is  equivalent  to 


linear  dependency  between  two  or  more  of  the  kernels. 


If  it  is  assumed  that  the  uncertainty  in  each  wavelength  i  is  e., 


Eq.  (58)  can  be  written  as: 


aiki(r)  +  «iei(r)  =  q 


or  in  vector  notation 


,  .+T  +T .  -*■ 

(k  +  e  )a  =  q 


+T-  . 

a  a  =  1 


-r 

where  the  k  vector  elements  are  k..(r),  the  a  vector  elements  are  the 


coefficients  a.,  and  the  e  vector  elements  are  e.(r)  that  give  the 
i  i 


uncertainty  in  the  measurement  or  represent  uncertainties  in  the 
physical  model  (single  scattering  approximation,  refractive  index 


uncertainties,  and  so  forth),  and  the  superscript  T  denotes  a  trans¬ 


pose  operation. 


The  quantity  to  be  minimized  is 


Hi  +T/t+  >WI»I  *i,> 
=  a  (k  +  e) (k  +  e  ) a 


T  .  tT, 


subject  to  the  constraint  a  a  =  1. 


i* 


ij 


If  the  expected  value  of  <|q|>  (where  <|*|>  denotes  an  averaging 


process)  is  taken  and  a  Gaussian  additive  noise  for  ^  is  assumed, 


such  that 


e.  =  N[0,  P?k?(r)]  , 


where  N  represents  a  normal  statistic  about  zero  mean  and  a  variance 
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P-jk^(r),  P.j  is  the  fraction  error  in  k^(r)  and  <k*e>  =  0,  Eq.  (60) 


yields: 


,i  1 2.  +T.+.+T+  +T  + 

<  | q  I  >  =  a  kk  a  +  a  Aa 


where  A  is  a  diagonal  matrix  whose  elements  are 

b 


ah  =/p?k(<r> dr- 


2 

Eq.  (61)  can  be  written  symbolically  as  <jq  |>  =  q  +  q2.  The  minimum 

of  q,  is  the  smallest  eigenvalue,  A  ,  of  the  matrix  C  whose  elements 
1  min 


U 


(r)  k .(r)  dr, 


0 

/k'J 


.  (r)  dr  =  1 


(Twomey*  ,  p.  189) 


q2  .1  P* 


Finally,  Eq.  (61)  yields 


o  ~  111  o  o 

<  I  q  |  > 2  =  A  .  (of  C)  +  y  of  pf. 

min  x  i  l 


The  maximum  value  of  q  can  be  calculated  from  Schwarz's  inequality: 


>  .N  , 


43 


a.  *4  a. 


,  ||.  ih'iUl*!  tl|»  |U  i>«  l^  ll^L* 


T  a'  Pz  <  Y  a f  T  P 
i =1  11  i  =1  1  i  =1  1 


2  =  mP2 


(if  we  assume  all  P.  =  P).  Hence,  if  A  .  >  mP  ,  the  m  kernels  are 

v  i  '  mi n 

independent.  To  ensure  a  signal-to-noise  ratio  of  10,  the  condition 


for  independence  of  m  kernels  which  contain  an  error  of  magnitude  P 


will  be  set  at 


A  .  =  10  mP  . 

min 


The  covariance  matrix  C  was  computed  for  various  numbers  of  wave¬ 


lengths.  The  covariance  matrix'  eigenvalues  were  analyzed  [Eq.  (62)] 


to  obtain  the  number  of  wavelengths  that  yield  independent  measurements 


with  some  predetermined  measurement  accuracy.  Results  are  shown  in 


Table  1. 


Table  1.  Number  of  independent  wavelengths 


(measurements)  and  minimum  accuracy  needed 


in  the  measurements. 


Number  of  Wavelengths 


Accuracy  [%] 


2.3.2  Information  of  Unknown  Size  Distribution  Contained  in  the  Kernels 

Direct  measurements  of  aerosol  size  distributions  show  that  the 
aerosol  size  distribution,  n(r),  can  be  approximately  described  by  a 
log  normal  distribution  ‘  and  a  power  law,  n(r)  =  cr  size  distri¬ 
bution.  36-39 

These  distributions  are  recoverable  from  the  measurements  only 
if  they  are  within  the  function  space  spanned  by  the  kernels  [i.e., 

Eq.  (56)  or  Eq.  (57)].  To  examine  the  extent  to  which  expected  size 
distributions,  n(r),  lie  within  the  function  space  of  the  kernels, 
the  following  procedure  was  followed.  A  set  of  orthonormal  basis 
functions  (referred  to  as  "natural"  basis  functions  in  the  previous 
work)  was  constructed  from  the  set  of  kernels.  Expected  size  distri¬ 
bution  functions,  n(r),  were  then  constructed  from  combinations  of 
the  basis  functions.  These  reconstructed  distribution  functions, 
f(r),  were  then  compared  to  the  original  n(r)  to  determine  the  degree 
to  which  the  f(r)  lie  within  the  space  of  the  kernels. 

Figures  3a-f  show  results  of  log-normal  functions,  n(r),  and 
the  constructed  functions,  f(r),  calculated  by  using  40  natural  basis 
functions.  The  parameters  for  the  log-normal  n(r)  (standard  variation 
and  mean  radius)  are  shown  in  the  figures. 

Examination  of  these  figures  shows  that,  while  some  size  dis¬ 
tribution  functions,  n(r),  are  reproduced  reasonably  well  by  the  basis 
functions  ( i.e Fig.  3c,d,e,f),  others  are  very  poorly  reconstructed 
[i.e.,  Figs.  4a,b,c,d,  and  Fig.  3a).  Thus,  it  is  evident  that  some 
size  distributions,  most  notably  power  law  types  (Figs.  4a-d)  and 
very  sharply  peaked  and  narrow  log-normal  types  (Fig.  3a)  lie  prima¬ 
rily  outside  of  the  function  space  of  the  kernels  {i.e.,  they  possess 
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Mgure  3.  Reconstruction  of  size  distribution  f(r)  from  log-normal 
size  distribution  n(r),  using  40  natural  basis  functions 
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Figure  3  (continued). 
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Figure  4.  Reconstruction  of  size  distribution  f(r)  from  power-law 
size  distribution  n(r),  using  40  natural  basis  functions 
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large  components  which  are  orthogonal  to  the  kernels)  and  therefore 
measurements  between  0.3  and  10.6  urn  contain  limited  information  as 
to  their  form. 

An  estimate  of  error  in  deducing  average  number  density  in  radii 
interval  Ar  =  0.5  urn  can  be  calculated  by  approximating  a  weighting 
function  w(r)29,  Eq*  44  from  the  40  kernel  functions.  This  error 
was  calculated  to  be  80%. 

2.4  Additional  Information  Contained  in  the  Pseudo-Empirical 
Orthogonal  Functions 

A  brief  review  of  the  method  of  solution  using  pseudo-empirical 
29 

functions  will  be  given  before  examining  the  pseudo-empirical  orthog¬ 
onal  basis  functions  used  in  this  work  and  presenting  results  of  the 
inversion  process. 

In  the  following  work,  basis  functions  constructed  from  a  matrix 
whose  elements  are  composed  of  a  set  of  mathematical  functions  (normal 
and  power  law  functions)  are  employed,  as  opposed  to  the  natural  basis 
functions  constructed  from  the  kernels  as  used  in  the  previous  section. 
These  basis  functions  are  referred  to  as  pseudo-orthogonal  basis  func¬ 
tions. 

Assuming  that  the  unknown  solution  f(r)  can  be  constructed  from 

coefficients  a.  and  basis  functions  <t>.(r),  Eq.  (56)  can  be  written: 

J  J 

gi  =  /ki^r)  ?  aj  <t’j(r)dr» 

or,  in  matrix  notation: 

g  =  Aa  (63) 

where  A  is  a  matrix  whose  elements  are: 


Aij  ■  /ki<r>  ♦o(r>dr- 


(64) 
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and  a  is  the  unknown  coefficient  vector  from  which  the  solution  f(r) 


is  constructed,  i.e., 

f(r)  =  Y  a.  4>.(r)  (65) 

j  j  j 

+ 

and  g  is  the  measurements  vector. 


A  direct  solution  of  Eq.  (63)  is  almost  always  unstable.  For  the 
present  method,  two  types  of  constraints  on  the  solution  are  employed 
in  order  to  solve  Eq.  (63).  The  first  constraint  is  a  smoothing  con- 
straint,  such  that: 

32f(r  ) 

- - —  =  min  ,  (66) 


as  the  solution  is  usually  expected  to  be  smooth  and  to  filter  out 
artificial  oscillations  in  the  solution.  The  second  constraint  is  a 


"positivity"  constraint,  such  that  f(r^)  >  0  for  any  r. ,  as  all  so’u- 
tion  points  must  physically  be  positive. 

The  positivity  constraint  is  employed  in  an  iterative  manner.  An 
initial  first-guess  distribution,  y(r),  that  is  positive  for  all  r  is 
used  to  start  the  procedure.  The  expression 


7(0  0) 

q  =  f2(r)  -  y(r)  f(r)  (67) 

(i) 

where  f(r)  is  the  first  iterative  solution,  is  then  minimized  as  the 
positivity  constraint.  For  any  value  of  y(r),  the  minimum  value  of  q 
is  -y(r)/4,  for  which  f(r)  =  y(r)/2,  a  positive  number.  This  con¬ 
straint  tends  to  force  the  solution  toward  y/2  for  the  first  iteration. 


The  degree  of  forcing  depends  on  the  strength  given  to  the  constraint. 

(0 

For  the  next  iteration,  y(r)  is  set  to  be  equal  to  2  f(r),  and  the 


I 


process  is  repeated.  Any  negative  values  of  f(r)  which  nay  appear  are 
encouraged  to  become  positive  in  successive  iterations. 


Using  the  method  of  Lagrange  multipliers,  the  final  solution  employ¬ 


ing  both  constraints  is  given  by: 

a  =  [A^A  +  y  H  +  y  H  ]  1  [A^g  +  y  h  (y)] 
l  's  s  p  p  a  'p  pw ' 

and 


Hr)  =  I  ai 4>i (r) 


(68) 

(69) 


(69a; 


where  H  and  H  are  constraint  matrices  arising  from  the  minimization 
p  s  a 

+  ■> 

criteria,  h  (y)  is  a  positivity  constraint  vector  which  is  a  func- 
tion  of  y,  y$  and  Yp  are  the  Lagrange  multipliers  which  determine  the 
strength  of  the  smoothing  and  positivity  constraints,  and  £  is  a  matrix 
whose  rows  are  ^(r).  T  denotes  a  transpose  operation,  as  before. 

The  standard  deviation  of  the  solution  is  the  square  root  of  the 
diagonal  elements  of  the  matrix  r.  ,  where 


r.  =  ff-i  ffTAAT(D-i)T 


(69b) 


and 


D-i  =  *T[aTA  +  ysH$  -  ^p]"1 


and  A  is  a  diagonal  matrix  whose  elements  are  the  expected  values  of 

the  variances  of  the  measurements  errors. 

The  basis  functions  <Mr)  can  be  constructed  from  the  eigenvalues 

J 

-w  3  2 

and  eigenvectors  of  the  covariance  matrix  C  (N*N)  (Twomey,  pp.  139- 
143)  formed  from  the  original  set  of  functions  f^(r),  such  that 

b 

Cij  =  f  Vr^  V  r)dr  (70) 

a 
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The  orthonormal  basis  functions  from  the  set  f. (r)  are  given  by 

4>-j  ( r )  =  X^172  S}  F(r)  (71) 

where  A.  and  u_j  are  the  1  eigenvalue  and  eigenvector,  respectively, 
of  the  matrix  C,  and  F(r)  is  a  column  vector  (Nxi)  the  elements  of 
which  are  continuous  functions  f^(r),  i  =  1,2»**N. 

Figures  3  and  4  showed  that  the  use  of  natural  basis  functions 
produces  a  less  than  satisfactory  result.  Therefore,  it  is  logical 
to  search  for  a  set  of  basis  functions  that  will  produce  better 
results. 

The  criteria  for  checking  a  proposed  empirical  basis  function  set 
are:  first,  that  they  should  be  able  to  construct  as  many  types  of 
aerosol  size  distribution  functions  as  possible;  second,  that  their 
orientation  will  be  such  that  the  greatest  amount  of  the  m-dimensional 
function  space  of  the  kernels  k.(r),  i  =  l,2***m  will  be  within  the 
p-dimensional  function  space  of  the  basis  functions  <^(r),  i  =  1,2* **p 
and  thirdly,  the  basis  functions  should  be  able  to  closely  approximate 
a  chosen  delta  Dirac  function,  in  order  that  a  minimal  spread  in  the 
construction  f(r)  results  from  using  the  basis  functions,  and  any 
uncertainty  about  the  location  r  of  the  solution  number  density  f(r) 
be  mi  nnna  I  . 

It  is  standard  procedure  to  compute  basis  functions  from  a  large 
library  of  actual,  measured  functions.  However,  in  the  present  case, 
since  an  adequate  library  of  measured  aerosol  size  distributions  does 
not  exist,  we  construct  a  "library"  based  upon  expected  forms  of  the 
unknown  distributions,  t.e.,  the  f(r)  functions  -  hence,  the  name 
"pseudo"-empirical  orthogonal  functions. 


L,>  ytt  .»|  <  ■ltVLi*tj,L*  L»WL>,>J‘i 


The  library  of  the  source  functions  f(r)  was  chosen  to  include 
68  functions  (Figure  5).  Sixty-three  functions  were  very  narrow 
normal  distributions,  spaced  uniformly  every  0.15  ym  between  0.2  urn 
and  9.5  ym  and  with  a  standard  deviation  of  0.2  ym.  The  remaining 
5  functions  were  chosen  to  be  Junge-type  distributions  f(r)  =  r~^v+^, 
where  the  v  values  are  1,  2,  3,  4,  and  5.  All  68  functions  were  cal¬ 
culated  for  a  radius  range  between  0.05  ym  and  10  ym.  Although  the 
first  30  basis  functions  can  account  for  99%  of  the  overall  variation 
of  the  68  functions,  the  errors  resulting  in  the  measurements  computed 
from  the  constructed  Junge-type  distributions  and  the  measurements 
computed  from  the  actual  Junge  distributions  were  of  several  orders  of 
magnitude.  Most  of  the  information  about  Junge-type  distribution  is 
contained  in  the  last  few  basis  functions.  Therefore,  all  68  basis 
functions  computed  from  Eq.  (71)  are  used. 

In  principle,  a  larger  set  of  source  functions  will  result  in  a 
better  quality  of  basis  functions  according  to  the  specified  proper¬ 
ties  mentioned  above.  This  source  set  of  equations  will  work  if  the 
normal  distribution  functions  are  as  narrow  as  possible  and  if  there 
is  an  overlap  between  the  functions  so  that  the  resulting  basis  func¬ 
tions  will  be  continuous  functions.  However,  as  the  set  of  normal 
source  functions  becomes  more  numerous,  more  basis  functions  are 
required  in  order  to  approximate  Junge-type  distributions  and  their 
measurements.  Hence,  the  dimension  of  the  A  A  matrix  will  be  bigger 
and  the  computation  time  needed  for  the  iterative  process  will 
increase  considerably.  Furthermore,  the  larger  the  set  of  source 
functions,  the  smaller  the  smallest  eigenvalues  become,  resulting 
in  poor  accuracy  in  computing  the  basis  functions  in  Eq.  (71). 
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Therefore,  an  optimal  number  of  functions  must  be  determined  which 
is  as  large  as  possible  and  still  allows  accurate  determination  of 
the  smallest  eigenvalues.  Based  on  the  factors  heretofore  described, 
the  basis  functions  chosen  represent  an  optimized  set. 

Figure  6a-e  presents  results  of  reconstructed  aerosol  size  dis¬ 
tributions  f(r)  from  various  aerosol  size  distribution  models  n(r)41 
that  differed  from  the  original  source  functions.  The  reconstructed 


distributions  were  calculated  from  the  pseudo-empirical  orthogonal 
functions  from  the  equation 

P 

f(r)  =  £  a1*i(r)  (72) 

i=l 


where 


n(r)<(>i(r)dr 


(73) 


The  solid  curves  are  the  analytic  models,  n(r),  and  the  symbols  repre¬ 
sent  the  reconstructed  f(r).  It  can  be  seen  that,  in  most  cases,  the 
symbols  fall  on  the  solid  curves,  which  is  to  say  that  f(r)  =  n(r). 

Figure  6a  represents  reconstruction  of  various  log-normal  distri¬ 
butions  given  by 

n(r)  =  1/(2tt)  ^  exp[-l/2  (In  r  -  In  r)  /ln2o]  (74) 


which  is  believed  to  represent  the  size  distribution  function  for 

3  3 

aerosols  having  soil -derived  components.  Figure  6b  shows  recon¬ 
structions  of  various  Junge-type  distributions  given  by 

n(r)  =  r"^1)  (75) 


which  was  proposed  by  Junge  to  represent  continental  aerosol .  The 
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Figure  6.  Reconstruction  of  size  distribution  f(r)  from  aero¬ 
sol  size  distribution  n(r),  using  the  "pseudo"-empi r- 
ical  basis  functions  (a)  for  log-normal  distribution, 
(b)  for  power-law  distribution,  (c)  for  regularized 
power-law  distribution,  (d)  for  modified  Gamma  distri¬ 
bution  and  (e)  for  inverse  modified  Gamma  distribution 
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Figure  6  (continued), 
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numerical  values  for  v  are  different  from  the  numerical  values  in  the 
source  function  set  (Figure  5).  An  inspection  of  Figure  6  reveals  that 
when  n(r)  changes  more  than  7  orders  of  magnitude,  the  constructed 
distribution  f(r)  (symbols)  deviates  from  n(r)  (solid  line).  However, 
measurements  computed  from  the  two  distributions  agree  within  5%.  In 
cases  where  a  significant  contribution  to  the  measurements  is  from 
number  densities  with  a  dynamic  range  of  more  than  seven  orders  of 
magnitude,  the  computed  measurements  from  f(r)  will  deviate  signifi¬ 
cantly  from  measurements  computed  from  n(r). 

In  Figure  6c,  n(r)  is  a  regularized  power  law  distribution  given 

n(r)  =  Pj/p2  [(r/p2)P3  1]/[1  +  (r/p2)P3]P,+  (76) 

where  px  through  p4  are  constants.  This  distribution  is  similar  to 
the  Junge-type  distribution  at  large  radii,  but  does  not  have  a  singu¬ 
larity  at  r  =  0. 

In  Figure  6d,  n(r)  is  the  so-called  modified  Gamma  distribution. 


given  by 


n(r)  =  pirP2exp(-p3rPlt) 


where  the  constants  p1  through  p4  determine  various  models  of  aero¬ 
sols  such  as  Haze  H,  Haze  M,  Haze  L,  and  cloud  C.l  to  C.3  used  by 
Deirmendjian.  Because  the  Haze  models  (L  and  H)  range  to  more  than 
40  orders  of  magnitude  for  the  radii  range  0.05-10.0  urn,  only  the 
cloud  C.l  to  C.3  are  presented. 

In  Figure  6e,  n(r)  is  the  so-called  inverse  modified  Gamma  distri¬ 


bution,  given  by 


n(r)  =  p1exp(-p3/rP‘*)/rp2 


which  describes  exponential  fall -off  at  small  radii  and  power  law 
behavior  at  large  radii.  This  type  of  distribution  can  represent  dry 
aerosols. 

The  fraction  of  the  basis  functions  which  is  orthogonal  to  the 
backscattering  kernels  determines  the  amount  of  information  about  the 
unknown  aerosol  size  distribution  which  cannot  be  obtained  through  the 
measurements. 

The  portion  of  the  kernel  k..(r)  for  wavelength  x^  which  is  within 
the  function  space  spanned  by  the  p  basis  functions  can  be  calculated 
by  Eq.  (79): 


The  portion  of  the  basis  function  j  which  is  within  the  function  space 
spanned  by  the  backscattering  kernel  functions  can  be  calculated  by 
Eq.  (80).: 


where  p..(r)  =  1.2***m  are  orthonormal  basis  functions  constructed  from 
the  kernel  functions. 

Figure  7a  shows  the  portion  of  k^(r)  which  is  within  the  function 
space  of  the  "pseudo"-empi rical  basis  functions.  It  can  be  seen  that. 


ve  to  the  kernels,  (a)  Portion  of  Kx  within  the  empirical  basis 
notions,  and  (b)  Portion  of  within  the  kernel  functions. 


on  the  average,  no  more  than  10%  of  the  kernel  k^(r)  is  orthogonal  to 
the  basis  functions.  Figure  7b  shows  that  about  50%  of  the  function 
space  of  the  basis  functions  is  outside  the  kernel  function  space. 
Thus,  the  basis  functions  contain  most  of  the  information  within  the 
kernels  and  also  add  additional  information  about  the  anticipated 
aerosol  size  distributions. 

The  approximation  of  f(rQ)  as  a  linear  sum  of  the  basis  functions 
results  in  some  uncertainty  as  to  the  location  of  the  radii  rQ  in  the 
number  density  f(rQ).  This  uncertainty  can  be  written  as 

P 

HrQ  +  Ar0)  =  (81) 

where  ArQ  is  the  uncertainty  in  the  location  rQ  for  the  number  density 
at  rQ.  The  magnitude  of  the  uncertainty  ArQ  can  be  estimated  by  sub¬ 
stituting  a  very  narrow  function  at  rQ  for  n(r)  in  Eq.  (72)  and  con¬ 
structing  a  number  density  function  [Eq.  (81)].  The  limit  of  a  very 
narrow  function  is  a  delta  function.  In  numerical  form,  the  delta 
function  will  be  approximated  by  a  normal  distribution  with  a  standard 
deviation  of  0.01  un. 

The  results  of  constructing  10  functions  [Eq.  (81)]  from  10  delta 
functions  centered  at  various  locations  rQ  are  given  in  Figure  8, 
where  it  is  shown  that  the  uncertainty  Arfl  is  about  0.25  urn,  and  the 
locations  rQ  are  exactly  at  the  locations  of  the  original  delta  func¬ 
tions.  Figure  8a  shows  one  of  the  constructed  delta  functions  of 
Figure  8. 

In  order  to  estimate  the  error  in  deducing  average  number  density 
in  radii  interval  Ar=0.5  un,  a  weighting  function  w(r)  was  constructed 
from  the  kernels  which  contain  the  additional  information  from  the 
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Figure  8a.  Enlargement  of  approximation  of  delta  function  6  of 
Fig.  8  using  the  "pseudo"-empirical  basis  functions 


This  error  was  calcu- 


pseudo-empi rical  functions  [B.(x)].29,  Eq’  54 
lated  to  be  50%.  In  Section  2.5,  it  will  be  shown  (Fig.  9)  that  this 
estimate  is  a  realistic  estimate,  and  results  of  inversions  performed 
with  pseudo-empirical  basis  functions  will  be  reported. 

2.5  Results 

Simulated  measurements  of  backscattered  radiation  and  refractive 
indices  were  provided  for  15  wavelengths  (between  0.2  urn  and  9  urn)  and 
for  four  aerosol  size  distributions  (denoted  by  data  sets  G,  H,  I,  and 
J).  (Or.  J.  Bottinger,  private  communication)  These  measurements  were 
perturbed  with  various  random  errors  which  were  normally  distributed 
with  a  standard  deviation  of  10%.  In  this  section,  inversions  for 
these  four  data  sets  will  be  performed  along  with  analyses  of  the 
inversion  results. 

Figures  9a-d  show  the  results  of  the  inversion  for  data  sets  G, 
H,  I,  and  J.  In  these  figures,  pseudo-empirical  orthogonal  functions 
were  used  to  obtain  two  solutions  for  each  data  set.  The  first  solu¬ 
tion  was  obtained  using  the  positivity  constraint  only  (y<.  =  0).  This 
solution  is  denoted  SOl.l.  For  the  second  solution  (SOL. 2),  both  the 
positivity  and  the  smoothing  constraints  were  used.  The  dashed  line 
(no  symbols)  represents  the  true  solution  for  each  of  the  four  data 
sets.  Figure  10  shows  the  provided  simulated  measurements  for  data  G 
and  the  computed  measurements  from  S0L.1  and  SOL. 2  from  Figure  9a, 
all  of  which  are  a  function  of  wavelength.  The  other  solutions  shown 
in  Figure  9b-d  reproduced  the  simulated  measurements  equally  well 
and  are  not  shown  here.  Figure  11  shows  the  iterative  process  used  in 
solving  Eq.  (68)  for  data  set  I.  The  residual  error  (the  difference 
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Figure  11.  Residual  error  and  number  of  negative  solu¬ 
tion  points  vs.  iteration  number  during  the 
iteration  process  for  inversion  of  data  "1". 
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between  the  simulated  measurements  and  the  computed  measurements  from 
+■ 

the  solution  f)  and  the  number  of  negative  solution  points  converge 
toward  zero  as  the  iteration  number  increases.  The  residual  error  is 
considered  to  be  zero  when  the  computed  measurements  are  within  10%  of 
the  simulated  measurements.  (The  simulated  measurements  contained  10% 
random  errors.) 

An  inspection  of  the  solutions  presented  in  Figure  9  shows  that 
S0L.1  differs  greatly  from  SOL. 2,  but  both  can  represent  a  real  aerosol 
size  distribution  (i.e.,  no  negative  aerosols),  and  both  reproduce  the 
simulated  measurements  equally  well.  SOL. 2  contains  a  smoothing  con¬ 
straint  on  the  unknown  solution.  An  inspection  of  SOL. 2  with  reference 
to  the  true  solution  in  Figure  9  reveals  the  assumption  of  smoothness. 
In  Figure  9a,  SOL. 2  describes  very  well  the  true  solution,  which  is 
smooth.  In  Figure  9b,  the  smoothing  constraint  forces  SOL. 2  to  be 
much  wider  than  the  true  solution.  In  Figure  9c,  it  may  be  noted 
that,  for  radii  less  than  1.5  urn,  the  smoothing  constraint  does  not 
allow  SOL. 2  to  reproduce  the  narrow  peak  at  1.0  um.  For  radii  larger 
than  1.5  wt\,  the  true  solution  is  nearly  a  straight  line  for  which  the 
second  derivative  is  zero.  In  this  part  of  the  curve,  the  assumption 
of  smoothness  is  accurate  and  is  reflected  in  SOL. 2.  S0L.1,  which  was 

obtained  by  assuming  only  that  f(r)  >  0,  preserves  the  width  of  the 
true  solutions,  as  can  be  seen  in  Figures  9b-d.  This  type  of  compar¬ 
ison  between  the  inversion  solution  and  the  true  solution  is  possible 
only  for  simulated  measurements.  If  both  solutions,  S0L.1  and  SOL. 2, 
were  obtained  from  real  measurements,  both  would  be  equally  plausible. 

The  aerosol  size  distribution  is  often  an  input  parameter  in 
radiative  transfer  models  and  climate  models  and  in  other  models  for 


atmospheric  physics.  The  optical  properties  which  are  of  interest  for 
these  models  are:  total  scattering,  absorption  and  extinction  cross 
sections,  single  scattering  albedo,  and  phase  function  of  the  aerosol 
size  distribution.  Figure  12a  shows  the  ratio  of  computed  optical 
properties  from  S0L.1  and  SOL. 2  for  data  G  as  a  function  of  wavelength. 
From  the  figure,  it  can  be  seen  that  the  difference  between  the  com¬ 
puted  optical  properties  from  S0L.1  and  SOL. 2  (in  Fig.  9a)  is  less  than 
10%.  Figure  12b  shows  the  ratio  of  computed  phase  function  for  three 
wavelengths  from  S0L.1  and  SOL. 2  for  data  G.  The  phase  functions  com¬ 
puted  from  S0L.1  and  SOL. 2  differ  by  less  than  20%.  Results  of  compu¬ 
tation  of  ratios  for  the  same  optical  properties  for  solutions  for  data 
sets  H,  I,  and  J  are  similar  to  Figures  12a  and  12b  and  are  not  shown. 

Physical  properties  such  as  average  radius,  total  mass,  total 
area,  and  average  number  density  were  computed  for  all  solutions  pre¬ 
sented  in  Figures  9a-d.  The  ratio  of  computed  physical  properties  from 
S0L.1  and  SOL. 2  for  each  data  set  is  shown  in  Figure  13.  The  differ¬ 
ence  between  computed  physical  properties  for  S0L.1  and  SOL. 2  is  less 
than  20%.  Therefore,  either  solution  (S0L.1  or  SOL. 2)  can  be  used 
equally  well  as  an  input  parameter  for  calculating  integrated  optical 
and  physical  properties  of  the  aerosol  size  distribution. 

2.6  Summary 

The  pseudo-empirical  orthogonal  method  of  solution29  was  applied 
for  determining  aerosol  size  distributions  from  backscattered  measure¬ 
ments.  Two  types  of  constraints  were  employed:  a  positivity  constraint 
alone  and  a  combination  of  positivity  and  smoothing  constraints.  The 
positivity  constraint  can  be  useful  when  the  aerosol  size  distribution 
is  known  to  be  a  narrow  distribution  or  an  unsmooth  function.  The  use 


scattering  albedo,  and  (b)  phase  functions  for  three  wavelengt 
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of  the  positivity  constraint  allows  for  a  solution  of  Eq.  (68)  by  an 
iterative  process  which  converges  very  quickly  (after  40  iterations). 

The  basis  functions  constructed  from  the  "pseudo"- empirical 
functions  can  reproduce  many  types  of  aerosol  distributions  and  the 
resulting  measurements  with  good  accuracy.  The  use  of  these  basis 
functions  does  not  overly  restrict  the  variety  of  possible  physical 
aerosol  size  distribution  solutions.  However,  the  use  of  empirical 
basis  functions  is  a  constraint  on  the  solution,  inasmuch  as  it  makes 
it  possible  to  obtain  aerosol  size  distributions  from  the  backscatter- 
ing  kernels,  which  otherwise  would  not  be  possible.  The  constraints 
( i.e .,  smoothing,  empirical  basis  functions)  produce  a  solution  which 
is  no  longer  completely  objective  but  which  reflects  the  assumptions 
that  are  built  into  the  constraints. 

The  accuracy  of  the  solution  [f(r)]  at  a  discrete  value  of  the 
radius  is  about  ±  50%.  Therefore,  interpretation  of  features  of  the 
solution  for  specific  radii  can  lead  to  wrong  conclusions.  Features 
in  the  solution  with  widths  smaller  than  0.3  urn  in  radius  cannot  be 
resolved.  The  inferred  aerosol  size  distribution  solution  can  be  used 
for  calculating  integrated  optical  and  mechanical  properties  of  the 
particulates.  The  quality  of  the  solution  depends  on  the  applicability 
of  the  constraints  for  the  given  problem.  However,  the  inverse  problem 
cannot  be  solved  without  using  some  constraints. 

If  no  constraints  are  used  and  if  the  problems  of  instability  in 
the  inverse  process  are  ignored,  the  types  of  solutions  and  the  infor¬ 
mation  content  which  can  be  obtained  from  the  backscattered  measure¬ 
ments  will  be  apparent  in  the  analysis  of  the  natural  basis  function  of 
the  kernels.  The  backscatteri ng  kernels  contain  more  information  about 


aerosol  size  di stribution  than  do  most  other  optical  kernels.  However, 
even  if  40  optimal  kernels  with  an  accuracy  of  0.5%  are  used  for  the 
inversion  process,  solutions  such  as  Junge-type  distributions  and  log¬ 
normal  distributions,  which  are  known  to  be  present  in  the  atmosphere 
cannot  be  obtained.  The  error  in  inferring  size  distributions  at  dis¬ 
crete  radii  is  about  ±  80%. 

The  volume  backscattering  cross-sections  (to  be  used  as  the  measure- 
ment  vector  g)  can  be  inferred  from  the  solution  to  the  lidar  equation 
with  an  accuracy  of  no  better  than  10-15%.  In  cases  where  the  lidar 
measurements  are  restricted  to  relatively  short  ranges,  volume  back- 
scattering  cross-sections  can  be  inferred  with  much  better  accuracy. 
However,  the  solution,  f(r),  continues  to  be  mathematical ly  and  phys¬ 
ically  non-unique.  The  physical  non-uniqueness  is  caused  by  the  uncer¬ 
tainties  about  particulate  refractive  indices  and  particle  shapes.  Dif¬ 
ferent  combinations  of  imaginary  refractive  indices,  possible  particle 
shapes  and  orientations,  and  various  breadths  of  aerosol  size  distribu¬ 
tions  can  all  produce  similar  backscattering  measurements.  The  result¬ 
ing  aerosol  size  distribution  solution  is,  therefore,  a  plausible  but 
non-unique  solution  for  the  inversion  of  multi -wavelength  backscattered 
radiation,  or  any  other  type  of  optical  measurements. 

In  this  work,  the  measurements  used  contained  10%  random  error,  a 
magnitude  of  error  typical  of  atmospheric  measurements.  This  error  is, 
of  course,  reflected  in  the  results  ( i.e .,  20%  deviation  in  properties 
calculated  from  S0L.1  and  SOL. 2).  It  may  be  noted  that  the  deviation 
between  the  derived  solutions  and  the  true  solution  can  be  much  less  in 
a  laboratory  experiment  where  the  accuracy  *n  the  measurements  can  be 
improved  and  the  refractive  index  and  the  sphericity  of  the  particles 
can  be  control  1 ed. 
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